A Bandit Framework for Strategic Regression
We consider a learner's problem of acquiring data dynamically for training a regression model, where the training data are collected from strategic data sources. A fundamental challenge is to incentivize data holders to exert effort to improve the quality of their reported data, even though that quality is not directly verifiable by the learner. In this work, we study a dynamic data acquisition process in which data holders can contribute multiple times. We propose the Strategic Regression-Upper Confidence Bound (SR-UCB) framework, a UCB-style index combined with a simple payment rule, where the index of a worker approximates the quality of his past contributions and is used by the learner to determine whether the worker receives future work. For linear regression and a certain family of non-linear regression problems, we show that SR-UCB enables a $O(\sqrt{\log T/T})$-Bayesian Nash Equilibrium (BNE) in which each worker exerts the target effort level chosen by the learner, where $T$ is the number of data acquisition stages.
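To make the high-level description concrete, the following is a minimal sketch of a UCB-style worker-selection loop in the spirit of the framework described above. The index construction, flat payment rule, and quality proxy are hypothetical illustrations and not the paper's actual SR-UCB index or payment rule; in particular, `observe_quality` stands in for whatever proxy the learner uses for the unverifiable quality of a contribution.

```python
import math
import random

def ucb_index(history, w, t):
    """UCB-style index for worker w at stage t: empirical mean of past
    quality estimates plus an exploration bonus; untried workers get
    an infinite index so that everyone is sampled at least once."""
    n = len(history[w])
    if n == 0:
        return float("inf")
    mean_quality = sum(history[w]) / n
    return mean_quality + math.sqrt(2.0 * math.log(t) / n)

def run_acquisition(workers, T, observe_quality, payment_per_task=1.0):
    """Dynamic data acquisition: at each stage, assign work to the worker
    with the highest index, record a (noisy) quality estimate of the data
    he contributes, and pay a flat amount for the task."""
    history = {w: [] for w in workers}
    total_payment = {w: 0.0 for w in workers}
    for t in range(1, T + 1):
        chosen = max(workers, key=lambda w: ucb_index(history, w, t))
        history[chosen].append(observe_quality(chosen))
        total_payment[chosen] += payment_per_task
    return history, total_payment

if __name__ == "__main__":
    # Placeholder environment: each worker's contributions have quality
    # centered at some (unobservable) effort level plus noise.
    effort = {w: 0.5 + 0.05 * w for w in range(5)}
    noisy = lambda w: effort[w] + random.gauss(0.0, 0.1)
    hist, pay = run_acquisition(list(effort), T=200, observe_quality=noisy)
    print({w: len(h) for w, h in hist.items()})  # how often each worker was used
```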
Reviews: A Bandit Framework for Strategic Regression
The question studied in the paper is interesting, and borrowing the idea from peer prediction to use the other arms' predictions as an unbiased estimator of the quality of one arm's prediction is a nice idea (in particular, because those arms need to be incentivized enough to make reasonably accurate predictions). However, the paper focuses too much on presenting "bells and whistles" rather than giving a deeper understanding of the basic (and main) results. Perhaps reorganizing the paper to only briefly mention the computational/privacy-aware variants, and giving both more intuition and technical content describing the main result (namely, that there exists an $\alpha$-BNE with small regret), would focus the paper and give the reader a cleaner message of what the paper is doing. This tack would have the added benefit that the reader might be able to better assess the "quantitative" consequences of this work, in that it would leave more room for the authors to ruminate on how much better or worse these bounds are than what one could get in the non-strategic setting, or in various trivial simplifications/special cases of this model.

As the paper stands, this reviewer finds it difficult to assess from the main body alone the technical contribution of the paper (and whether the results follow from a mild reworking of standard proofs or require substantial new ideas). It is also difficult to assess a theory paper that does not give even a sketch or outline of a proof in the main body.
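For readers unfamiliar with the peer-prediction idea the review refers to, the following sketch illustrates one generic way a learner can score a worker's report against the other workers' reports without access to ground truth. It is a leave-one-out comparison for illustration only; the paper's actual estimator and its unbiasedness argument are not reproduced here, and all names below are hypothetical.

```python
from statistics import mean

def peer_quality_score(reports, i):
    """Score worker i's report on one data point by comparing it to the
    leave-one-out average of the other workers' reports.  If the other
    workers' noise is independent and zero-mean, their average acts as a
    proxy for the unknown ground truth, so a smaller squared gap suggests
    higher quality.  Returned as negative squared error (higher = better)."""
    others = [r for j, r in enumerate(reports) if j != i]
    return -(reports[i] - mean(others)) ** 2

if __name__ == "__main__":
    # Hypothetical reports from four workers on the same regression target.
    reports = [2.1, 1.9, 2.0, 3.4]
    scores = [round(peer_quality_score(reports, i), 3) for i in range(len(reports))]
    print(scores)  # the outlier report (3.4) receives the worst score
```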